PACKAGING CELL LINES FOR HIV-DERIVED RETROVIRAL VECTOR PARTICLES

BACKGROUND OF THE INVENTION 

Retroviral vectors based on lentiviruses, such as human immunodeficiency
viruses (HIV), can infect nondividing cells, and integration of proviral DNA occurs
without the need for cell division. These properties make lentiviruses attractive for gene
transfer into nondividing cells, such as hepatocytes, myofibers, hematopoietic stem
cells, and neurons.

However, the use of lentivirus vectors, especially HIV vectors, for gene therapy
is hampered by concern over their safety. Thus, a need exists for the development of
lentivirus vectors, particularly HIV vectors, with improved safety, especially for
gene therapy.

SUMMARY OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
preferred embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

In one embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins.

In a second embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in
the cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a second retroviral nucleotide sequence in the cell
which comprises the coding sequence for a heterologous envelope protein.

In a third embodiment of the invention, packaging cell lines for producing viral
accessory protein independent lentivirus-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and (d) a third retroviral
nucleotide sequence which comprises a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and integration.

In a fourth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent lentivirus-derived retroviral vector particles
comprise (a) a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the
cell which comprises a coding sequence for lentivirus gagpol, wherein said coding
sequence has been codon optimized by mutagenesis to improve expression of the
lentivirus gagpol proteins; and (c) a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration.

In a fifth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins.

In a sixth embodiment of the invention, packaging cell lines for producing viral
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell
(e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a second retroviral nucleotide sequence in the cell which comprises the coding
sequence for a heterologous envelope protein.

In a seventh embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; (c)
a second retroviral nucleotide sequence in the cell which comprises the coding sequence
for a heterologous envelope protein; and (d) a third retroviral nucleotide sequence which
comprises a DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

In an eighth embodiment of the invention, packaging cell lines for producing
viral accessory protein independent HIV-derived retroviral vector particles comprise (a)
a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the cell which
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and
(c) a retroviral nucleotide sequence which comprises a DNA sequence of interest and
HIV cis-acting sequences required for packaging, reverse transcription and integration.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a retroviral nucleotide sequence which comprises a codon optimized gag
coding sequence and (2) a retroviral nucleotide sequence which comprises a codon
optimized pol coding sequence, in place of the retroviral nucleotide sequence which
comprises a codon optimized gagpol coding sequence.

In a particular embodiment, the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G). In another embodiment, the
heterologous envelope protein is the amphotropic envelope of the Moloney leukemia
virus (MLV).

Cell lines for producing viral accessory protein independent lentivirus-derived
retroviral vector particles are produced by transfecting host cells (e.g., mammalian host
cells) with a plasmid comprising a DNA sequence which encodes lentivirus gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the lentivirus gagpol proteins. Depending upon the particular
cell line being produced, the host cells are also co-transfected with a plasmid
comprising a DNA sequence which encodes a heterologous envelope protein, or a
plasmid comprising a DNA sequence of interest and lentivirus cis-acting sequences
required for packaging, reverse transcription and integration, or both of these plasmids.
Alternatively, host cells are transfected with a plasmid comprising a codon optimized
DNA sequence encoding a lentivirus gag protein and a plasmid comprising a codon
optimized DNA sequence encoding a lentivirus pol protein, in place of the plasmid
comprising a codon optimized DNA sequence encoding both lentivirus gagpol proteins.

Cell lines for producing viral accessory protein independent HIV-derived
retroviral vector particles are produced by co-transfecting host cells (e.g., mammalian
host cells) with a plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to
improve expression of the HIV gagpol proteins. Depending upon the particular cell line
being produced, the host cells are also co-transfected with a plasmid comprising a DNA
sequence which encodes a heterologous envelope protein, or a plasmid comprising a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both of these plasmids. Alternatively, host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding an
HIV gag protein and a plasmid comprising a codon optimized DNA sequence encoding
an HIV pol protein, in place of the plasmid comprising a codon optimized DNA sequence
encoding both HIV gagpol proteins.

The present invention also relates to methods of producing viral accessory
protein independent lentivirus-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA sequence
has been codon optimized by mutagenesis to improve expression of the lentivirus gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and lentivirus cis-acting sequences required for packaging, reverse transcription
and integration. Alternatively, host cells are transfected with a plasmid comprising a
codon optimized DNA sequence encoding a lentivirus gag protein and a plasmid
comprising a codon optimized DNA sequence encoding a lentivirus pol protein, in place
of the first plasmid comprising a codon optimized DNA sequence encoding both
lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a
DNA sequence which encodes HIV gagpol proteins, wherein said DNA sequence has
been codon optimized by mutagenesis to improve expression of the HIV gagpol
proteins; (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, host cells are transfected with a plasmid comprising a codon
optimized DNA sequence encoding an HIV gag protein and a plasmid comprising a
codon optimized DNA sequence encoding an HIV pol protein, in place of the first
plasmid comprising a codon optimized DNA sequence encoding both HIV gagpol
proteins.

The present invention also relates to viral accessory protein-independent
retroviral particles produced by or obtainable by (obtained by) the methods described
herein.

The present invention further relates to isolated DNA encoding a codon
optimized lentivirus gagpol, isolated DNA encoding the gag coding region of a codon
optimized lentivirus gagpol, and isolated DNA encoding the pol coding region of a
codon optimized lentivirus gagpol. In a particular embodiment, the present invention
relates to isolated DNA encoding a codon optimized HIV gagpol, isolated DNA
encoding the gag coding region of a codon optimized HIV gagpol, and isolated DNA
encoding the pol coding region of a codon optimized HIV gagpol.

The packaging cell lines and viral particles of the present invention can be used
for gene therapy or gene replacement with improved safety. The packaging cell lines
and viral particles of the present invention can also be used in development and
production of vaccines, and in production of biochemical reagents. Gene therapy
vectors produced with the cell lines of the present invention are expected to be valuable
medical therapeutics.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of an expression cassette containing the codon
optimized gagpol genes. The DNA was constructed in multiple segments, which are
indicated at the top as 1/3, 2/3, 3/3 (A, B, C and D) and HIN. Restriction sites used to
assemble the cloned segments are indicated above the kilobasepair (Kb) ruler. Below
the ruler are multiple features showing the location of the human cytomegalovirus
(CMV) promoter, human betaglobin sequences (Bglobin), mRNA sequences (thinner
line represents intronic sequence), the gag and pol open reading frames, the individual
proteolytic fragment coding sequences (p17_MA, p24_CA, p7, p6, PR, p51_RT,
RNaseH and integrase (IN)) and each synthetic oligonucleotide used in the assembly
process (multiple adjacent open arrows).

Figure 2 is a table which depicts codon usage frequencies in genes which are
highly expressed and in the codon optimized gagpol open reading frame of the HIV
packaging construct described herein.
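
As a rough illustration of how a codon usage table of the kind shown in Figure 2 might be tabulated, the following Python sketch counts the codons of an open reading frame. The function name `codon_usage` and the toy input sequence are hypothetical and are not taken from the figures. The normalization used in Figure 2 is not stated in this text, so the sketch simply reports each codon's share of all codons in the reading frame.

```python
from collections import Counter


def codon_usage(orf):
    """Return {codon: fraction of all codons} for an in-frame DNA sequence."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    # Split the reading frame into consecutive, non-overlapping triplets.
    counts = Counter(orf[i:i + 3] for i in range(0, len(orf), 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Toy input (illustrative only): four codons, two of which are GGC.
usage = codon_usage("ATGGGCGGCTGA")
```

Applying the same function to a wildtype and to a codon optimized reading frame would give two tables that can be compared codon by codon.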

Figure 3 is a schematic representation of the HIV provirus and a three-plasmid
expression system used for generating a pseudotyped HIV-based vector by transient
transfection as described in Naldini et al., Science, 272:263-267 (1996).

Figure 4 is a list of some characteristics relating to the HIV Rev protein.

Figure 5 is a list of some points relating to codon optimization of HIV gagpol.

Figure 6 is a partial DNA sequence of HIV gag (SEQ ID NO:1), showing
inactivation of inhibitory sequences as described in Schwartz, S. et al., J. Virol.,
66(12):7176-7182 (1992).

Figure 7 is a plot of the %(G+C) content of wildtype HIV gagpol sequences and
theoretically codon optimized HIV gagpol sequences. The percent of bases, either G or
C, was calculated for a 30 nucleotide moving window for the entire length of the gagpol
gene, and the value plotted versus nucleotide position. Diamonds = HIV gagpol
sequences; squares = full optimal back-translation for gag open reading frame;
triangles = full optimal back-translation for pol open reading frame; CO = codon
optimized.
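
The moving-window calculation described for Figure 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `gc_percent_windows` is hypothetical, the window is advanced one nucleotide at a time, and each value is reported at the 1-based position of the first base in the window. The text specifies only the 30 nucleotide window size, so the step size and position convention are assumptions.

```python
def gc_percent_windows(seq, window=30):
    """Return a list of (start_position, percent_GC) for each full window.

    Positions are 1-based and refer to the first base of the window
    (an assumed convention; the source does not specify one).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start + 1, 100.0 * gc / window))
    return results


# Toy input with a 2-nucleotide window so the values are easy to follow.
profile = gc_percent_windows("ATGC", window=2)
```

For the plot itself, the function would be called with `window=30` on the wildtype and on the codon optimized sequence, and the two resulting profiles plotted against nucleotide position.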

Figures 8A-8E depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the gag coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates the nucleotide sequence (SEQ
ID NO:2) and predicted amino acid sequence (SEQ ID NO:3) for the gag coding region
of a wildtype HIV gagpol. "pHDMHgpm2.seq" indicates the nucleotide sequence (SEQ
ID NO:4) and predicted amino acid sequence (SEQ ID NO:5) for the gag coding region
of a codon optimized HIV gagpol. The "NL4-3 genbank.SEQ" sequences are publicly
available at the NIH GenBank sequence repository (Accession No. M19921).

Figures 9A-9L depict the alignment of the nucleotide sequences and predicted
amino acid sequences for the pol coding region of a wildtype HIV gagpol and a codon
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates a nucleotide sequence (SEQ
ID NO:6) and a predicted amino acid sequence (SEQ ID NO:7) for the pol coding
region of a wildtype HIV gagpol available in the NIH GenBank sequence repository
(Accession No. M19921). The nucleotide and amino acid sequences for the pol coding
region available in the GenBank sequence repository contain two sequence errors,
which are indicated in Figures 9A-9L with shading. "pNL4-3.seq" indicates the correct
nucleotide sequence (SEQ ID NO:8) and predicted amino acid sequence (SEQ ID
NO:9) for the pol coding region of a wildtype HIV gagpol. "pHDMHgpm2.seq"
indicates the nucleotide sequence (SEQ ID NO:10) and predicted amino acid sequence
(SEQ ID NO:11) for the pol coding region of a codon optimized HIV gagpol.

Figures 10A-10D depict the DNA sequence (SEQ ID NO:12) for pHDMHgpm2.
The CMV enhancer/promoter is at nucleotides 97 to 679, human betaglobin sequences
(Bglobin) are at nucleotides 761 to 864, 865 to 1303 and 5710 to 6469 (end of Bglobin
is at nucleotides 6445 to 6469), mRNA sequences are at nucleotides 680 to 778 and 1255
to 5921, the SV40 origin of replication is at nucleotides 8796 to 8908, the beta-lactamase
(bla) coding region is at nucleotides 6709 to 7569, intron sequences are at nucleotides 779 to
1254, the codon optimized gag coding region is at nucleotides 1318 to 2820, the codon
optimized pol coding region is at nucleotides 2619 to 5624 and the poly A site is at
nucleotides 5897 to 5921.
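
The feature coordinates listed for Figures 10A-10D lend themselves to a quick arithmetic check. The Python sketch below is illustrative only: the dictionary `FEATURES` and the helper names are hypothetical, the coordinates are transcribed from the paragraph above (reading the gag start as 1318) and treated as 1-based and inclusive, which is an assumption, and the overlap value is simply what those stated coordinates imply.

```python
# Coordinates transcribed from the text; assumed 1-based and inclusive.
FEATURES = {
    "CMV enhancer/promoter": (97, 679),
    "intron": (779, 1254),
    "gag coding region": (1318, 2820),
    "pol coding region": (2619, 5624),
    "poly A site": (5897, 5921),
    "bla coding region": (6709, 7569),
    "SV40 origin of replication": (8796, 8908),
    # Region described elsewhere in the text as left unoptimized.
    "unoptimized region": (2583, 2819),
}


def span_length(start, end):
    """Length of an inclusive coordinate range."""
    return end - start + 1


def overlap_length(a, b):
    """Number of positions shared by two inclusive ranges (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


lengths = {name: span_length(*coords) for name, coords in FEATURES.items()}
gag_pol_overlap = overlap_length(
    FEATURES["gag coding region"], FEATURES["pol coding region"]
)
```

With these inputs the unoptimized region comes out at 237 nucleotides, matching the figure of 237 bp given in the description, and the gag and pol coding regions each have a length divisible by three.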

Figure 11 is a circular map of plasmid pHDMHgpml.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to novel packaging cell lines useful for generating
viral accessory protein independent lentivirus-derived, particularly HIV-derived,
retroviral vector particles, to construction of such cell lines and to methods of using the
accessory protein independent lentivirus-derived retroviral vector particles to introduce
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a
particular embodiment, the packaging cell lines of the present invention are stable
packaging cell lines.

The cell lines are engineered to express the lentivirus proteins necessary for
virus particle formation (gagpol proteins), without containing DNA sequences from
lentivirus accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev response
element (RRE)). Additionally, no viral sequences (such as cis-acting elements termed
constitutive transport elements (CTEs)) will be expressed as RNA of any kind. DNA
sequences for lentivirus gagpol are codon optimized by extensively mutagenizing the
sequences to improve expression and to reduce the risk of recombination between
transfer vector sequences and gagpol messenger RNA. This greatly improves the safety
of virus preparations generated from these cell lines. In a particular embodiment, the
DNA sequences for lentivirus gagpol are not codon optimized in the overlap region
between the gag and pol sequences and in cis-acting signals necessary for translation of
pol.

Examples of lentiviruses include human immunodeficiency viruses (e.g., HIV-1,
HIV-2, HIV-3), bovine lentiviruses (e.g., bovine immunodeficiency viruses, bovine
immunodeficiency-like viruses, Jembrana disease viruses), equine lentiviruses (e.g.,
equine infectious anemia viruses), feline lentiviruses (e.g., feline immunodeficiency
viruses, panther lentiviruses, puma lentiviruses), ovine/caprine lentiviruses (e.g.,
Brazilian caprine lentiviruses, caprine arthritis-encephalitis viruses, Maedi-Visna
viruses, Maedi-Visna-like viruses, Maedi-Visna-related viruses, ovine lentiviruses,
Visna lentiviruses), Simian AIDS retroviruses (e.g., human T-cell lymphotropic virus
type 4), simian immunodeficiency viruses, simian-human immunodeficiency viruses,
human lymphotrophic viruses (e.g., type III), and simian T-cell lymphotrophic viruses.

In another embodiment, cell lines are engineered to express the HIV proteins
necessary for virus particle formation (gagpol proteins), without containing DNA
sequences from HIV accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev
response element (RRE)). Additionally, no viral sequences (such as cis-acting elements
termed constitutive transport elements (CTEs)) will be expressed as RNA of any kind.
DNA sequences for an HIV gagpol are codon optimized by mutagenesis to improve
expression and to reduce the risk of recombination between transfer vector sequences
and gagpol messenger RNA. In a particular embodiment, the DNA sequences for HIV
gagpol are not codon optimized in the overlap region between the gag and pol
sequences and in cis-acting signals necessary for translation of pol.

Alternatively, each of the packaging cell lines described herein can be produced
using (1) a nucleotide sequence which comprises a codon optimized gag coding
sequence and (2) a nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the nucleotide sequence which comprises a codon optimized
gagpol coding sequence. In this embodiment, the gag and pol coding sequences can be
completely codon optimized.

Benefits of the present invention include the removal of potentially harmful 
lentivirus accessory proteins and other viral sequences, and the reduction of the risk of 
recombination to produce replication competent virus. 

Packaging cell lines for producing viral accessory protein independent
lentivirus-derived retroviral vector particles comprise a mammalian cell and a retroviral
nucleotide sequence comprising a coding sequence for a lentivirus gagpol which has
been codon optimized. In a particular embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein. In a second embodiment, the packaging cell lines
further comprise a retroviral nucleotide sequence comprising a coding sequence for a
heterologous envelope protein and a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration. In a third embodiment, the packaging cell lines further
comprise a retroviral nucleotide sequence which comprises a DNA sequence of interest
and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, the packaging cell lines of the present invention comprise (1) a
retroviral nucleotide sequence which comprises a codon optimized gag coding sequence
and (2) a retroviral nucleotide sequence which comprises a codon optimized pol coding
sequence, in place of the retroviral nucleotide sequence which comprises a codon
optimized gagpol coding sequence.

Codon optimization of the coding sequence(s) for lentivirus gagpol results in
improved expression of the lentivirus gagpol proteins and reduces
the risk of recombination between the transfer vector and gagpol messenger RNA.
Codon optimization of the coding sequence(s) for lentivirus gagpol was obtained by
mutagenizing, for each particular amino acid residue, specific nucleic acid bases in a
codon for that residue to a nucleic acid base which is present in a codon that encodes
the same amino acid residue and occurs at a high frequency in genes which are highly
expressed. In a particular embodiment, the resulting optimized codon
also does not cause introduction of mRNA splicing signals into the codon optimized
sequence. Thus, in a particular embodiment, codon optimization of the coding
sequence(s) for lentivirus gagpol is obtained by mutagenizing, for each particular amino
acid residue, specific nucleic acid bases in a codon for that residue
to a nucleic acid base that is present in a codon which (1) encodes the same amino acid
residue and occurs at a high frequency in genes which are highly expressed and (2) does not
cause introduction of mRNA splicing signals into the codon optimized sequence.
Codon optimization typically results in the removal of A-rich
instability elements.
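
The substitution rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and is not the procedure actually used to build the packaging construct: the `PREFERRED` table is an illustrative placeholder rather than the frequencies of Figure 2, the function names and the `protected` parameter are hypothetical, and the screen for mRNA splicing signals is not implemented. The `protected` parameter mimics leaving a region, such as the gag/pol overlap, unoptimized.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))

# Illustrative placeholder: one assumed "preferred" codon per amino acid.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    seq = seq.upper()
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_optimize(seq, preferred=PREFERRED, protected=()):
    """Swap each codon for the preferred synonymous codon.

    `protected` holds 0-based, half-open (start, end) nucleotide ranges;
    any codon overlapping one of them is copied unchanged. Stop codons
    are also left unchanged because they are absent from `preferred`.
    """
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        if any(i < end and i + 3 > start for start, end in protected):
            out.append(codon)
        else:
            out.append(preferred.get(GENETIC_CODE[codon], codon))
    return "".join(out)


wild_type = "ATGGGTGCGAGAGCGTCA"  # toy example sequence
optimized = codon_optimize(wild_type)
partially_optimized = codon_optimize(wild_type, protected=[(3, 6)])
```

Because every replacement codon encodes the same amino acid as the codon it replaces, `translate(optimized)` equals `translate(wild_type)`, while the nucleotide sequences differ.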

In a particular embodiment, the coding sequence for an HIV gagpol (pNL4-3;
available through the AIDS repository, NIH; Adachi et al., J. Virol., 59:284-291 (1986))
has been codon optimized to improve translational efficiency of the HIV gagpol
proteins and reduce the risk of recombination between the transfer vector and HIV
gagpol messenger RNA. Two hundred thirty-seven base pairs (237 bp) consisting of the
gag pol overlap and cis-acting signals necessary for translation of pol (nucleotides 2583
to 2819 of SEQ ID NO:12) were not optimized. The HIV gagpol sequence obtained
using the codon optimization process does not differ at the amino acid level from the
wildtype HIV gagpol sequence, but differs at the nucleotide level from the wildtype HIV
gagpol sequence. A codon optimized HIV gag sequence is shown in Figures 8A-8E
(pHDMHgpm2.seq) (SEQ ID NO:4). A codon optimized HIV pol sequence is shown in
Figures 9A-9L (pHDMHgpm2.seq) (SEQ ID NO:10).

A plasmid comprising DNA sequences which encode codon optimized lentivirus
gagpol proteins is also referred to herein as a packaging construct. This plasmid
includes a promoter which drives the expression of the gagpol proteins, such as the
human cytomegalovirus (hCMV) immediate early promoter. This plasmid is defective
for the production of the viral envelope and accessory proteins tat, vif, vpr, vpu, nef and
rev and the Rev response element (RRE). The packaging construct also does not
contain viral sequences which are transcribed into mRNA, such as constitutive transport
elements (CTEs).

A packaging construct comprising a codon optimized HIV gagpol is depicted in
Figure 1 and in Figure 11. Figures 10A-10D depict the DNA sequence (SEQ ID
NO:12) for the packaging construct pHDMHgpm2. This packaging construct
(pHDMHgpm2) was constructed as follows: Plasmid pMDA.HIVgp mam was
generated by chemical synthesis and PCR assembly (which is described in, for example,
Stemmer et al., Gene, 164:49-53 (1995)) of 215 different oligonucleotides. The DNA
sequence for pMDA.HIVgp mam is the same as the DNA sequence for
pMDA.HIVgp jtg except for 4.3 kb which was codon optimized using the DNAStar
program (LaserGene, Madison, WI). Two hundred thirty-seven base pairs (237 bp)
consisting of the gag pol overlap and cis-acting signals necessary for translation of pol
(nucleotides 2583 to 2819 of SEQ ID NO:12) were not optimized due to dual reading
frame constraints. An NsiI site 5' of IN was preserved to aid fusion with wildtype
sequences. Several single or double base pair silent mutations were introduced either to
prevent potential splice donors and acceptors, or by the synthesis process.
pMDA.HIVgp jtg was derived from HIV-1 strain NL4-3. The protease mutation that is
present in the NL4-3 NIH GenBank sequence was then repaired (Figure 9B), changing
the nucleotide present at position 2948 of SEQ ID NO:12 from a "G" to a "C", thereby
producing the codon present at nucleotide positions 2948 to 2950 of SEQ ID NO:12
which encodes an arginine instead of the glycine present in the NL4-3 GenBank amino
acid sequence. The resulting plasmid was named pMDHgpmam. The EcoRI-HindIII
fragment of pMDHgpmam was inserted into pHDM2b, a high copy version of the pMD
vector (Ory, D. et al., Proc. Natl. Acad. Sci. USA, 93(21):11400-11406 (1996)), to
produce plasmid pHDMHgpm. The sequencing mutation that is present in the RNase
domain of the NL4-3 NIH GenBank sequence was repaired (Figure 9H), changing the
codon present at nucleotide positions 4724 to 4726 of SEQ ID NO:12 from "GGG" to
"AAG", thereby producing a codon encoding a lysine instead of the glycine present in
the NL4-3 GenBank amino acid sequence. The resulting plasmid was named
pHDMHgpm2. Codon usage frequencies in the codon optimized gagpol open reading
frame of the packaging construct pHDMHgpm2 are shown in Figure 2.
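
The amino acid consequences of the two repairs described above follow from the standard genetic code and can be checked with a short Python sketch. The names `CODON_TO_AA` and `substitute` are hypothetical. The text gives only the first base of the repaired protease codon, so the sketch enumerates all four glycine codons (GGN) as an assumption about what the original codon could have been. It does not use or verify any nucleotide coordinates.

```python
# Minimal slice of the standard genetic code needed for this check.
CODON_TO_AA = {"AAG": "Lys"}
for third in "TCAG":
    CODON_TO_AA["GG" + third] = "Gly"  # GGN encodes glycine
    CODON_TO_AA["CG" + third] = "Arg"  # CGN encodes arginine


def substitute(codon, offset, base):
    """Return `codon` with the base at 0-based `offset` replaced by `base`."""
    return codon[:offset] + base + codon[offset + 1:]


# Protease repair: first base of a glycine codon changed from G to C.
protease_repairs = {}
for third in "TCAG":
    before = "GG" + third
    after = substitute(before, 0, "C")
    protease_repairs[before] = (after, CODON_TO_AA[before], CODON_TO_AA[after])

# RNase-domain repair: the codon GGG replaced by AAG.
rnase_repair = ("GGG", "AAG", CODON_TO_AA["GGG"], CODON_TO_AA["AAG"])
```

Whatever the second and third bases of the glycine codon are, a G to C change at its first position yields an arginine codon, and replacing GGG with AAG changes glycine to lysine, consistent with the description.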

As used herein, a heterologous envelope protein permits pseudotyping of
particles generated by the packaging construct and includes the G glycoprotein of
vesicular stomatitis virus (VSV G) and the amphotropic envelope of the Moloney
leukemia virus (MLV). A plasmid comprising a DNA sequence which encodes a
heterologous envelope protein is also referred to herein as an envelope coding plasmid.

The terms "mammal" and "mammalian", as used herein, refer to any vertebrate
animal, including monotremes, marsupials and placentals, that suckle their young and
either give birth to living young (eutherian or placental mammals) or are egg-laying
(metatherian or nonplacental mammals). Examples of mammalian species include
humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice,
guinea pigs) and ruminants (e.g., cows, pigs, horses).

Examples of mammalian cells include human (such as HeLa cells, 293T cells,
NIH 3T3 cells), bovine, ovine, porcine, murine (such as embryonic stem cells), rabbit
and monkey (such as COS1 cells) cells. The cell may be a non-dividing cell (including
hepatocytes, myofibers, hematopoietic stem cells, neurons) or a dividing cell. The cell
may be an embryonic cell, bone marrow stem cell or other progenitor cell. Where the
cell is a somatic cell, the cell can be, for example, an epithelial cell, fibroblast, smooth
muscle cell, blood cell (including a hematopoietic cell, red blood cell, T-cell, B-cell,
etc.), tumor cell, cardiac muscle cell, macrophage, dendritic cell, neuronal cell (e.g., a
glial cell or astrocyte), or pathogen-infected cell (e.g., those infected by bacteria,
viruses, virusoids, parasites, or prions).

Typically, cells isolated from a specific tissue (such as epithelium, fibroblast or
hematopoietic cells) are categorized as a "cell-type." The cells can be obtained
commercially or from a depository or obtained directly from an animal, such as by
biopsy. Alternatively, the cell need not be isolated at all from the animal where, for
example, it is desirable to deliver the virus to the animal in gene therapy.

To produce the cell lines of the present invention for producing viral accessory
protein independent lentivirus-derived retroviral vector particles, mammalian host cells
are co-transfected with (a) a first plasmid comprising a DNA sequence which encodes
lentivirus gagpol proteins, wherein said DNA sequence has been codon optimized by
mutagenesis, as described above, to improve expression of the lentivirus gagpol
proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required for packaging,
reverse transcription and integration, or both, under conditions appropriate for
transfection of the cells.

In a particular embodiment, to produce the cell lines of the present invention for
producing viral accessory protein independent HIV-derived retroviral vector particles,
mammalian host cells were co-transfected with (a) a first plasmid comprising a DNA
sequence which encodes HIV gagpol proteins, wherein said DNA sequence has been
codon optimized by mutagenesis, as described above, to improve expression of the HIV
gagpol proteins; and (b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse
transcription and integration, or both, under conditions appropriate for transfection of
the cells.

Virus stocks consisting of viral accessory protein independent lentivirus-derived,
particularly HIV-derived, retroviral vector particles of the present invention are
produced by maintaining the transfected cells under conditions suitable for virus
production (e.g., in an appropriate growth media and for an appropriate period of time).
Such conditions, which are not critical to the invention, are generally known in the art.
See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor University Press, New York (1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York (1998); U.S. Patent
No. 5,449,614; and U.S. Patent No. 5,460,959, the teachings of which are incorporated
herein by reference.

To generate viral accessory protein independent lentivirus-derived retroviral
vector particles, mammalian host cells can be co-transfected with (a) a first plasmid
comprising a DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA
sequence has been codon optimized by mutagenesis, as described above, to improve
expression of the lentivirus gagpol proteins; (b) a second plasmid comprising a DNA
sequence which encodes a heterologous envelope protein; and (c) a third plasmid
comprising a DNA sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration. Alternatively, mammalian host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding a
lentivirus gag protein and a plasmid comprising a codon optimized DNA sequence
encoding a lentivirus pol protein, in place of the first plasmid comprising a codon
optimized DNA sequence encoding both lentivirus gagpol proteins.

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting mammalian host cells with (a) a first plasmid comprising a DNA sequence
which encodes HIV gagpol proteins, wherein said DNA sequence has been codon
optimized by mutagenesis, as described above, to improve expression of the HIV gagpol
proteins; (b) a second plasmid containing a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, mammalian host cells are transfected with a plasmid
comprising a codon optimized DNA sequence encoding an HIV gag protein and a
plasmid comprising a codon optimized DNA sequence encoding an HIV pol protein, in
place of the first plasmid comprising a codon optimized DNA sequence encoding both
HIV gagpol proteins.
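
The plasmid combinations described above can be summarized in a minimal sketch, assuming each plasmid is labeled with the functions it supplies. The class `Plasmid`, the function `missing_functions`, and the labels "gag", "pol", "env" and "transfer" are hypothetical names chosen for illustration; they do not come from the specification, and the sketch only checks bookkeeping, not biology.

```python
from dataclasses import dataclass

# Hypothetical labels for the functions a co-transfection must supply:
# gag and pol proteins, a heterologous envelope protein, and a transfer
# vector (DNA of interest plus cis-acting sequences).
REQUIRED = frozenset({"gag", "pol", "env", "transfer"})


@dataclass(frozen=True)
class Plasmid:
    name: str
    provides: frozenset  # subset of REQUIRED


def missing_functions(plasmids):
    """Return the required functions that no plasmid in the set supplies."""
    supplied = frozenset().union(*(p.provides for p in plasmids))
    return REQUIRED - supplied


# Three-plasmid system: one codon optimized gagpol plasmid.
three_plasmid = [
    Plasmid("packaging (gagpol)", frozenset({"gag", "pol"})),
    Plasmid("envelope", frozenset({"env"})),
    Plasmid("transfer vector", frozenset({"transfer"})),
]

# Alternative: separate gag and pol plasmids in place of the gagpol plasmid.
split_gag_pol = [
    Plasmid("gag", frozenset({"gag"})),
    Plasmid("pol", frozenset({"pol"})),
    Plasmid("envelope", frozenset({"env"})),
    Plasmid("transfer vector", frozenset({"transfer"})),
]

print(missing_functions(three_plasmid))       # empty set: nothing missing
print(missing_functions(three_plasmid[:2]))   # the transfer vector is missing
```

Both the three-plasmid set and the split gag/pol alternative supply every function, which is the sense in which the text treats them as interchangeable.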

Virus particles produced by the methods described herein, using a codon
optimized HIV packaging construct produced as described herein, were compared by
Western analysis with virus particles produced as described in Naldini et al., Science,
272:263-267 (1996), using the packaging construct plasmid pCMVΔR8.2. Both the
immunological reactivity and the proteolytic processing were confirmed to be
indistinguishable.

A plasmid comprising a DNA sequence of interest and HIV cis-acting sequences
required for packaging, reverse transcription and integration is also referred to herein as
a transfer vector. A transfer vector, as used herein, refers to a vehicle which is used to
introduce a DNA of interest into a eukaryotic cell, particularly a mammalian cell.
Figure 3 depicts an example of a transfer vector.

DNA sequence of interest, as used herein, includes all or a portion of a gene or
genes encoding a nucleic acid product whose expression in a cell or a mammal is
desired. In a particular embodiment, the nucleic acid product is a heterologous
therapeutic protein. Examples of therapeutic proteins include antigens or immunogens,
such as a polyvalent vaccine, cytokines, tumor necrosis factor, interferons, interleukins,
adenosine deaminase, insulin, T-cell receptors, soluble CD4, growth factors (such as
epidermal growth factor, human growth factor, insulin-like growth factors, fibroblast
growth factors), blood factors, such as Factor VIII, Factor IX, cytochrome b,
glucocerebrosidase, ApoE, ApoC, ApoAI, the LDL receptor, negative selection markers
or "suicide proteins", such as thymidine kinase (including the HSV, CMV, VZV TK),
anti-angiogenic factors, Fc receptors, plasminogen activators, such as t-PA, u-PA and
streptokinase, dopamine, MHC, tumor suppressor genes such as p53 and Rb,
monoclonal antibodies or antigen binding fragments thereof, drug resistance genes, ion
channels, such as a calcium channel or a potassium channel, adrenergic receptors,
hormones (including growth hormones) and anti-cancer agents. In another embodiment,
the nucleic acid product is a gene product to be expressed in a cell or a mammal and
which product is otherwise defective or absent in the cell or mammal. For example, the
nucleic acid product can be a functional gene(s) which is defective or absent in the cell
or mammal.

DNA sequence of interest includes DNA sequences (control sequences) which
are necessary to drive the expression of the gene or genes. The control sequences are
operably linked to the gene. The term "operably linked", as used herein, is defined to
mean that the gene is linked to control sequences in a manner which allows expression
of the gene (or the nucleic acid sequence). Generally, operably linked means
contiguous.

Control sequences include a transcriptional promoter, an optional operator
sequence to control transcription, a sequence encoding suitable mRNA ribosomal
binding sites and sequences which control termination of transcription and translation.
In a particular embodiment, a recombinant gene encoding a desired nucleic acid product
can be placed under the regulatory control of a promoter which can be induced or
repressed, thereby offering a greater degree of control with respect to the level of the
product produced.

As used herein, the term "promoter" refers to a sequence of DNA, usually
upstream (5') of the coding region of a structural gene, which controls the expression of
the coding region by providing recognition and binding sites for RNA polymerase and
other factors which may be required for initiation of transcription. Suitable promoters
are well known in the art. Exemplary promoters include the SV40, CMV and human
elongation factor (EF1) promoters. Other suitable promoters are readily available in the
art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley &
Sons, Inc., New York (1998); Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd edition, Cold Spring Harbor University Press, New York (1989); and U.S.
Patent No. 5,681,735).

A DNA sequence of interest can be isolated from nature, modified from native
sequences or manufactured de novo, as described in, for example, Ausubel et al.,
Current Protocols in Molecular Biology, John Wiley & Sons, New York (1998); and
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring
Harbor University Press, New York (1989). DNA sequences can be isolated and fused
together by methods known in the art, such as exploiting and manufacturing compatible
cloning or restriction sites.

The packaging cell lines and viral particles of the present invention can be used,
in vitro, in vivo and ex vivo, to introduce DNA of interest into a eukaryotic cell (e.g., a
mammalian cell) or a mammal (e.g., a human or other mammal or vertebrate). The cells
can be obtained commercially or from a depository or obtained directly from a mammal,
such as by biopsy. The cells can be obtained from a mammal to whom they will be
returned or from a different mammal of the same or different species. For
example, using the packaging cell lines or viral particles of the present invention, DNA
of interest can be introduced into nonhuman cells, such as pig cells, which are then
introduced into a human. Alternatively, the cell need not be isolated from the mammal
where, for example, it is desirable to deliver viral particles of the present invention to the
mammal in gene therapy.

Ex vivo therapy has been described, for example, in Kasid et al., Proc. Natl.
Acad. Sci. USA, 87:473 (1990); Rosenberg et al., N. Engl. J. Med., 323:570 (1990);
Williams et al., Nature, 310:476 (1984); Dick et al., Cell, 42:71 (1985); Keller et al.,
Nature, 318:149 (1985); and Anderson et al., United States Patent No. 5,399,346.

Methods for administering (introducing) viral particles directly to a mammal are
generally known to those practiced in the art. For example, modes of administration
include parenteral, injection, mucosal, systemic, implant, intraperitoneal, oral,
intradermal, transdermal (e.g., in slow release polymers), intramuscular, intravenous
including infusion and/or bolus injection, subcutaneous, topical, epidural, etc. Viral
particles of the present invention can, preferably, be administered in a pharmaceutically
acceptable carrier, such as saline, sterile water, Ringer's solution, and isotonic sodium
chloride solution.

The dosage of a viral particle of the present invention administered to a 
mammal, including frequency of administration, will vary depending upon a variety of 
factors, including mode and route of administration; size, age, sex, health, body weight 
and diet of the recipient mammal; nature and extent of symptoms of the disease or
disorder being treated; kind of concurrent treatment, frequency of treatment, and the
effect desired.

The teachings of all the articles, patents, patent applications and GenBank
sequences cited herein are incorporated by reference in their entirety.

While this invention has been particularly shown and described with references
to preferred embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.

CLAIMS

What is claimed is:

1. A packaging cell line for producing a viral accessory protein independent
HIV-derived retroviral vector particle comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and

d) a third retroviral nucleotide sequence in the cell which comprises a DNA
sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

2. A packaging cell line of Claim 1 wherein the heterologous envelope protein is
the G glycoprotein of vesicular stomatitis virus (VSV G).

3. A packaging cell line of Claim 1 wherein the heterologous envelope protein is 
the amphotropic envelope of the Moloney leukemia virus. 

4. A packaging cell line of Claim 1 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 

5. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises a
DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

6. A packaging cell line of Claim 5 wherein the DNA sequence of interest encodes
a heterologous therapeutic protein.

7. A packaging cell line comprising:

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for a HIV gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the HIV gagpol proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

8. A method of producing a packaging cell line for producing a viral accessory
protein independent HIV-derived retroviral vector particle, comprising
co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

9. A method of Claim 8 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

10. A method of Claim 8 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus. 

11. A method of Claim 8 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

12. A method of producing a viral accessory protein independent HIV-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been mutagenized to improve 
expression of the HIV gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and HIV cis- 
acting sequences required for packaging, reverse transcription and 
integration. 

13. A method of Claim 12 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

14. A method of Claim 12 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

15. A method of Claim 12 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

16. A packaging cell line for producing a viral accessory protein independent
lentivirus-derived retroviral vector particle comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a lentivirus gagpol, wherein said coding sequence
has been mutagenized to improve expression of the lentivirus gagpol
proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and

d) a third retroviral nucleotide sequence in the cell which comprises a DNA
sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration.

17. A packaging cell line of Claim 16 wherein the heterologous envelope protein is 
the G glycoprotein of vesicular stomatitis virus (VSV G). 

18. A packaging cell line of Claim 16 wherein the heterologous envelope protein is
the amphotropic envelope of the Moloney leukemia virus.

19. A packaging cell line of Claim 16 wherein the DNA sequence of interest
encodes a heterologous therapeutic protein.

20. A packaging cell line comprising:

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises a 
DNA sequence of interest and lentivirus cis-acting sequences required 
for packaging, reverse transcription and integration. 

21. A packaging cell line of Claim 20 wherein the DNA sequence of interest 
encodes a heterologous therapeutic protein. 

22. A packaging cell line comprising: 

a) a mammalian cell; 

b) a first retroviral nucleotide sequence in the cell which comprises a 
coding sequence for lentivirus gagpol, wherein said coding sequence has 
been mutagenized to improve expression of the lentivirus gagpol 
proteins; and 

c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein. 

23. A method of producing a packaging cell line for producing a viral accessory 
protein independent lentivirus-derived retroviral vector particle, comprising co- 
transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

24. A method of Claim 23 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

25. A method of Claim 23 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

26. A method of Claim 23 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

27. A method of producing a viral accessory protein independent lentivirus-derived 
retroviral vector particle comprising co-transfecting mammalian host cells with: 

a) a first plasmid comprising a DNA sequence which encodes lentivirus 
gagpol proteins, wherein said DNA sequence has been mutagenized to 
improve expression of the lentivirus gagpol proteins; 

b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and 

c) a third plasmid comprising a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and 
integration. 

28. A method of Claim 27 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

29. A method of Claim 27 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

30. A method of Claim 27 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

31. A viral accessory protein independent HIV-derived retroviral vector particle
produced by the method comprising co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

32. A method of Claim 31 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

33. A method of Claim 31 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

34. A method of Claim 31 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

35. A viral accessory protein independent lentivirus-derived retroviral vector
particle produced by the method comprising co-transfecting mammalian host
cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

36. A method of Claim 35 wherein the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). 

37. A method of Claim 35 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

38. A method of Claim 35 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

39. Isolated DNA encoding a codon optimized HIV gagpol. 

40. Isolated DNA encoding a codon optimized HIV gag. 

41. Isolated DNA of Claim 40 comprising the nucleotide sequence of SEQ ID NO:4.

42. Isolated DNA encoding a codon optimized HIV pol.

43. Isolated DNA of Claim 42 comprising the nucleotide sequence of SEQ ID
NO:ia 

44. A method of introducing a DNA sequence of interest into a mammal comprising 
introducing into said mammal a viral accessory protein independent
HIV-derived retroviral vector particle comprising the DNA sequence of interest.

45. The method of Claim 44 wherein the mammal is a human. 

46. The method of Claim 44 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

47. A method of introducing a DNA sequence of interest into a mammal comprising
the steps of:

a) introducing into cells a viral accessory protein independent HIV-derived
retroviral vector particle comprising the DNA sequence of interest; and

b) returning the cells obtained in step a) to the mammal. 

48. The method of Claim 47 wherein the mammal is a human. 

49. The method of Claim 47 wherein the DNA sequence of interest is a heterologous
therapeutic protein.

Rev

• Regulates HIV gene expression by promoting
cytoplasmic levels of unspliced and singly spliced
mRNAs

• Postulated to affect splicing, stability, transport,
and translation

Fig. 4

Codon Optimization of HIV gagpol

• Remove A-rich instability elements 

• Improve translational efficiency 

• Reduce risk of recombination with 
transfer vector 



Fig. 5 
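
The goals listed for Fig. 5 can be illustrated with a minimal sketch, assuming a simple scheme in which each codon is replaced by one preferred synonymous codon. The `PREFERRED` table and the function names are hypothetical and cover only the amino acids in this short example; the codon choices are the ones visible in the first row of the gag alignment (pHDMHgpm2), and the specification does not state that the construct was generated by this procedure.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred-codon table (an assumption for illustration only).
PREFERRED = {"M": "ATG", "G": "GGC", "A": "GCC", "R": "CGC", "S": "TCC",
             "V": "GTG", "L": "CTG", "E": "GAG", "D": "GAC", "K": "AAG"}


def codons(dna):
    """Split a coding sequence into consecutive codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def optimize(dna):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    return "".join(PREFERRED.get(CODON_TABLE[c], c) for c in codons(dna))


# First 15 codons of gag as read from the NL4-3 row of the alignment.
wild = "ATGGGTGCGAGAGCGTCGGTATTAAGCGGGGGAGAATTAGATAAA"
optimized = optimize(wild)

print(optimized)
print(translate(wild) == translate(optimized))        # protein is unchanged
print(wild.count("A"), "->", optimized.count("A"))    # A content drops
```

The substitution leaves the encoded protein unchanged while lowering the A content of the sequence, which corresponds to the first bullet of Fig. 5; the reduced nucleotide identity with the wild-type sequence relates to the third.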
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

792 MGARASVLSGGELDK NL4-3 genbank.SEQ
792 ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA
1319 MGARASVLSGGELDK pHDMHgpm2.seq
1319 ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG

837 WEKIRLRPGGKKQYK NL4-3 genbank.SEQ
837 TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT [illegible]
1364 WEKIRLRPGGKKQYK pHDMHgpm2.seq
1364 TGG GAG AAG ATC CGC CTG CGC CCC GGC GGC AAG AAG CAG TAC [illegible]

882 LKHIVWASRELERFA NL4-3 genbank.SEQ
882 CTA AAA CAT ATA GTA TGG GCA AGC AGG GAG CTA GAA CGA TTC [illegible]
1409 LKHIVWASRELERFA pHDMHgpm2.seq
1409 CTG AAG CAC ATC GTG TGG GCC TCC CGC GAG CTG GAG CGC TTC GCC

927 VNPGLLETSEGCRQI NL4-3 genbank.SEQ
927 GTT AAT CCT GGC CTT TTA GAG ACA TCA GAA GGC TGT AGA CAA ATA
1454 VNPGLLETSEGCRQI pHDMHgpm2.seq
1454 GTG AAC CCC GGC CTG CTG GAG ACC TCC GAG GGC TGC CGC CAG ATC

972 LGQLQPSLQTGSEEL NL4-3 genbank.SEQ
972 CTG GGA CAG CTA CAA CCA TCC CTT CAG ACA GGA TCA [illegible] [illegible] [illegible]
1499 LGQLQPSLQTGSEEL pHDMHgpm2.seq
1499 CTG GGC CAG CTG CAG CCC TCC CTG CAA ACC GGC TCC GAG GAG CTG

1017 RSLYNTIAVLYCVHQ NL4-3 genbank.SEQ
1017 AGA TCA TTA TAT AAT ACA ATA GCA GTC CTC TAT TGT GTG CAT CAA
1544 RSLYNTIAVLYCVHQ pHDMHgpm2.seq
1544 CGC TCC CTG TAC AAC ACC ATC GCC GTG CTG TAC TGC GTG CAC CAG

1062 RIDVKDTKEALDKIE NL4-3 genbank.SEQ
1062 [illegible]
1589 RIDVKDTKEALDKIE pHDMHgpm2.seq
1589 CGC ATC GAC GTG AAG GAC ACC AAG GAG GCC CTG GAC AAG ATC GAG

1107 EEQNKSKKKAQQAAA NL4-3 genbank.SEQ
1107 [illegible] AAA AAG GCA CAG CAA GCA [illegible] GCT
1634 EEQNKSKKKAQQAAA pHDMHgpm2.seq
1634 GAG GAG CAG AAC AAG TCC AAG AAG [illegible] GCC CAG CAG GCC GCC GCC
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1152 DTGNNSQVSQNYPIV NL4-3 genbank.SEQ
1152 GAC ACA GGA AAC AAC AGC CAG GTC AGC CAA AAT TAC CCT ATA GTG
1679 DTGNNSQVSQNYPIV pHDMHgpm2.seq
1679 GAC ACC GGC AAC AAC TCC CAG GTG TCC CAG AAC TAC CCC ATC GTG

1197 QNLQGQMVHQAISPR NL4-3 genbank.SEQ
1197 CAG AAC CTC CAG GGG CAA ATG GTA CAT CAG GCC ATA TCA CCT AGA
1724 QNLQGQMVHQAISPR pHDMHgpm2.seq
1724 CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC

1242 TLNAWVKVVEEKAFS NL4-3 genbank.SEQ
1242 ACT TTA AAT GCA TGG GTA AAA GTA GTA GAA GAG AAG GCT TTC AGC
1769 TLNAWVKVVEEKAFS pHDMHgpm2.seq
1769 ACC CTG AAC GCC TGG GTG AAG GTG GTG GAG GAG AAG GCC TTC TCC

1287 PEVIPMFSALSEGAT NL4-3 genbank.SEQ
1287 CCA GAA GTA ATA CCC ATG TTT TCA GCA TTA TCA GAA GGA GCC ACC
1814 PEVIPMFSALSEGAT pHDMHgpm2.seq
1814 CCC GAA GTC ATC CCC ATG TTC TCC GCC CTG TCC GAG GGC GCC ACC

1332 PQDLNTMLNTVGGHQ NL4-3 genbank.SEQ
1332 CCA CAA GAT TTA AAT ACC ATG CTA AAC ACA GTG GGG GGA CAT CAA
1859 PQDLNTMLNTVGGHQ pHDMHgpm2.seq
1859 CCC CAG GAC CTG AAC ACC ATG CTG AAC ACC GTG GGC GGC CAC CAG

1377 AAMQMLKETINEEAA NL4-3 genbank.SEQ
1377 GCA GCC ATG CAA ATG TTA AAA GAG ACC ATC AAT GAG GAA GCT GCA
1904 AAMQMLKETINEEAA pHDMHgpm2.seq
1904 GCC GCC ATG CAG ATG CTG AAG GAG ACC ATC AAC GAG GAG GCC GCC

1422 EWDRLHPVHAGPIAP NL4-3 genbank.SEQ
1422 GAA TGG GAT AGA TTG CAT CCA GTG CAT GCA GGG CCT ATT GCA CCA
1949 EWDRLHPVHAGPIAP pHDMHgpm2.seq
1949 GAG TGG GAC CGC CTG CAC CCC GTG CAC GCC GGC CCC ATC GCC CCC

1467 GQMREPRGSDIAGTT NL4-3 genbank.SEQ
1467 GGC CAG ATG AGA GAA CCA AGG GGA AGT GAC ATA GCA GGA ACT ACT
1994 GQMREPRGSDIAGTT pHDMHgpm2.seq
1994 GGC CAG ATG CGC GAG CCC CGC GGC TCC GAC ATC GCC GGC ACC ACC

Fig. 8B



WO 00/15819 PCT/US99/20675 10/29



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1512 STLQEQIGWMTHNPP NL4-3 genbank.SEQ
1512 AGT ACC CTT CAG GAA CAA ATA GGA TGG ATG ACA CAT AAT CCA CCT
2039 STLQEQIGWMTHNPP pHDMHgpm2.seq
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC

1557 IPVGEIYKRWIILGL NL4-3 genbank.SEQ
1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA
2084 IPVGEIYKRWIILGL pHDMHgpm2.seq
2084 ATC CCC GTG GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG

1602 NKIVRMYSPTSILDI NL4-3 genbank.SEQ
1602 AAT AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA
2129 NKIVRMYSPTSILDI pHDMHgpm2.seq
2129 AAC AAG ATC GTG CGC ATG TAC TCC CCC ACC TCC ATC CTG GAC ATC

1647 RQGPKEPFRDYVDRF NL4-3 genbank.SEQ
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TAT GTA GAC CGA TTC
2174 RQGPKEPFRDYVDRF pHDMHgpm2.seq
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC

1692 YKTLRAEQASQEVKN NL4-3 genbank.SEQ
1692 TAT AAA ACT CTA AGA GCC GAG CAA GCT TCA CAA GAG GTA AAA AAT
2219 YKTLRAEQASQEVKN pHDMHgpm2.seq
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC

1737 WMTETLLVQNANPDC NL4-3 genbank.SEQ
1737 TGG ATG ACA GAA ACC TTG TTG GTC CAA AAT GCG AAC CCA GAT TGT
2264 WMTETLLVQNANPDC pHDMHgpm2.seq
2264 TGG ATG ACC GAG ACC CTG CTG GTG CAG AAC GCC AAC CCC GAC TGC

1782 KTILKALGPGATLEE NL4-3 genbank.SEQ
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA
2309 KTILKALGPGATLEE pHDMHgpm2.seq
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG

1827 MMTACQGVGGPGHKA NL4-3 genbank.SEQ
1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA
2354 MMTACQGVGGPGHKA pHDMHgpm2.seq
2354 ATG ATG ACC GCC TGC CAG GGC GTG GGC GGC CCC GGC CAC AAG GCC
Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1872 RVLAEAMSQVTNPAT NL4-3 genbank.SEQ
1872 AGA GTT TTG GCT GAA GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC
2399 RVLAEAMSQVTNPAT pHDMHgpm2.seq
2399 CGC GTG CTG GCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC

1917 IMIQKGNFRNQRKTV NL4-3 genbank.SEQ
1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT
2444 IMIQKGNFRNQRKTV pHDMHgpm2.seq
2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG

1962 KCFNCGKEGHIAKNC NL4-3 genbank.SEQ
1962 AAG TGT TTC AAT TGT GGC AAA GAA GGG CAC ATA GCC AAA AAT TGC
2489 KCFNCGKEGHIAKNC pHDMHgpm2.seq
2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC

2007 RAPRKKGCWKCGKEG NL4-3 genbank.SEQ
2007 AGG GCC CCT AGG AAA AAG GGC TGT TGG AAA TGT GGA AAG GAA GGA
2534 RAPRKKGCWKCGKEG pHDMHgpm2.seq
2534 CGC GCC CCC CGC AAG AAG GGC TGC TGG AAG TGC GGC AAG GAG GGC

2052 HQMKDCTERQANFLG NL4-3 genbank.SEQ
2052 CAC CAA ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG
2579 HQMKDCTERQANFLG pHDMHgpm2.seq
2579 CAC CAG ATG AAA GAT TGT ACT GAG AGA CAG GCT AAT TTT TTA GGG

2097 KIWPSHKGRPGNFLQ NL4-3 genbank.SEQ
2097 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG
2624 KIWPSHKGRPGNFLQ pHDMHgpm2.seq
2624 AAG ATC TGG CCT TCC CAC AAG GGA AGG CCA GGG AAT TTT CTT CAG

2142 SRPEPTAPPEESFRF NL4-3 genbank.SEQ
2142 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT
2669 SRPEPTAPPEESFRF pHDMHgpm2.seq
2669 AGC AGA CCA GAG CCA ACA GCC CCA CCA GAA GAG AGC TTC AGG TTT

2187 GEETTTPSQKQEPID NL4-3 genbank.SEQ
2187 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC
2714 GEETTTPSQKQEPID pHDMHgpm2.seq
2714 GGG GAA GAG ACA ACA ACT CCC TCT CAG AAG CAG GAG CCG ATA GAC

Fig. 8D






Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

2232 KELYPLASLRSLFGS NL4-3 genbank.SEQ
2232 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC
2759 KELYPLASLRSLFGS pHDMHgpm2.seq
2759 AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC

2277 DPSSQ. NL4-3 genbank.SEQ
2277 GAC CCC TCG TCA CAA TAA
2804 DPSSQ. pHDMHgpm2.seq
2804 GAC CCC TCG TCA CAA TAA
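
A row of the gag alignment can be checked programmatically. The following is a minimal sketch, assuming the two rows are supplied as space-separated codons; the function names are hypothetical, and the example uses the row beginning at position 1197 (NL4-3) and 1724 (pHDMHgpm2) as read from the alignment. It confirms that both rows encode the same amino acids and counts how many codons and nucleotides differ.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(row):
    """Accept spaced or contiguous sequence text and return a list of codons."""
    seq = row.replace(" ", "").upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def compare_rows(wild_row, optimized_row):
    """Return (same_protein, changed_codons, nucleotide_mismatches)."""
    wild, opt = split_codons(wild_row), split_codons(optimized_row)
    if len(wild) != len(opt):
        raise ValueError("rows have different lengths")
    pairs = list(zip(wild, opt))
    same_protein = all(CODON_TABLE[w] == CODON_TABLE[o] for w, o in pairs)
    changed_codons = sum(w != o for w, o in pairs)
    mismatches = sum(a != b for w, o in pairs for a, b in zip(w, o))
    return same_protein, changed_codons, mismatches


NL43_ROW = "CAG AAC CTC CAG GGG CAA ATG GTA CAT CAG GCC ATA TCA CCT AGA"
OPT_ROW = "CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC"

result = compare_rows(NL43_ROW, OPT_ROW)
print(result)
```

For this row the protein is unchanged while 9 of 15 codons and 10 of 45 nucleotides differ, which illustrates how synonymous changes lower sequence identity with the wild-type gagpol without altering the encoded protein.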
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Alignment Report of Codon Optimization (pol}.MEG, udng Clustal method with PAM250 residue weight table. 



2087 FFREDLAFPQGKARE NL4-3 genbank.SEQ
2087 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA
2085 FFREDLAFPQGKARE pNL4-3.seq
2085 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA
2612 FFREDLAFPQGKARE pHDMHgpm2.seq
2612 TTT TTT AGG GAA GAT CTG GCC TTC CCA CAA GGG AAG GCC AGG GAA

2132 FSSEQTRANSPTRRE NL4-3 genbank.SEQ
2132 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG
2130 FSSEQTRANSPTRRE pNL4-3.seq
2130 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG
2657 FSSEQTRANSPTRRE pHDMHgpm2.seq
2657 TTT TCT TCA GAG CAG ACC AGA GCC AAC AGC CCC ACC AGA AGA GAG

2177 LQVWGRDNNSLSEAG NL4-3 genbank.SEQ
2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA
2175 LQVWGRDNNSLSEAG pNL4-3.seq
2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA
2702 LQVWGRDNNSLSEAG pHDMHgpm2.seq
2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA

2222 ADRQGTVSFSFPQIT NL4-3 genbank.SEQ
2222 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2220 ADRQGTVSFSFPQIT pNL4-3.seq
2220 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT
2747 ADRQGTVSFSFPQIT pHDMHgpm2.seq
2747 GCC GAT AGA CAA GGA ACT GTA TCC TTT AGC TTC CCT CAG ATC ACT

2267 LWQRPLVTIKIGGQL NL4-3 genbank.SEQ
2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2265 LWQRPLVTIKIGGQL pNL4-3.seq
2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA
2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq
2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG

2312 KEALLDTGADDTVLE NL4-3 genbank.SEQ
2312 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA
2310 KEALLDTGADDTVLE pNL4-3.seq
2310 AAG GAA GCT CTA TTA GAT ACA GGA GCA GAT GAT ACA GTA TTA GAA
2837 KEALLDTGADDTVLE pHDMHgpm2.seq
2837 AAG GAG GCC CTG CTG GAC ACC GGC GCC GAC GAC ACC GTG CTG GAG
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



2882 EMNLPGRWKPKMIGG pHDMHgpm2.seq
2882 GAG ATG AAC CTG CCC GGC CGC TGG AAG CCC AAG ATG ATC GGC GGC

2402 IGGFIKVRQYDQILI NL4-3 genbank.SEQ
2402 ATT GGA GGT TTT ATC AAA GTA AGA CAG TAT GAT CAG ATA CTC ATA
2400 IGGFIKVRQYDQILI pNL4-3.seq
2400 ATT GGA GGT TTT ATC AAA GTA AGA CAG TAT GAT CAG ATA CTC ATA
2927 IGGFIKVRQYDQILI pHDMHgpm2.seq
2927 ATC GGC GGC TTC ATC AAA GTC CGC CAG TAC GAC CAG ATC CTG ATC

2447 EICGHKAIGTVLVGP NL4-3 genbank.SEQ
2447 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2445 EICGHKAIGTVLVGP pNL4-3.seq
2445 GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT
2972 EICGHKAIGTVLVGP pHDMHgpm2.seq
2972 GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC

2492 TPVNIIGRNLLTQIG NL4-3 genbank.SEQ
2492 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
2490 TPVNIIGRNLLTQIG pNL4-3.seq
2490 ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC
3017 TPVNIIGRNLLTQIG pHDMHgpm2.seq
3017 ACC CCC GTG AAC ATC ATC GGC CGC AAC CTG CTG ACC CAG ATC GGC

2537 CTLNFPISPIETVPV NL4-3 genbank.SEQ
2537 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
2535 CTLNFPISPIETVPV pNL4-3.seq
2535 TGC ACT TTA AAT TTT CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA
3062 CTLNFPISPIETVPV pHDMHgpm2.seq
3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG

3107 KLKPGMDGPKVKQWP pHDMHgpm2.seq
3107 AAG CTG AAG CCC GGC ATG GAC GGC CCC AAA GTC AAG CAG TGG CCC
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




2627 LTEEKIKALVEICTE NL4-3 genbank.SEQ
2627 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
2625 LTEEKIKALVEICTE pNL4-3.seq
2625 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA
3152 LTEEKIKALVEICTE pHDMHgpm2.seq
3152 CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG

2672 MEKEGKISKIGPENP NL4-3 genbank.SEQ
2672 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
2670 MEKEGKISKIGPENP pNL4-3.seq
2670 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA
3197 MEKEGKISKIGPENP pHDMHgpm2.seq
3197 ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC

2717 YNTPVFAIKKKDSTK NL4-3 genbank.SEQ
2717 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
2715 YNTPVFAIKKKDSTK pNL4-3.seq
2715 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA
3242 YNTPVFAIKKKDSTK pHDMHgpm2.seq
3242 TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG

2762 WRKLVDFRELNKRTQ NL4-3 genbank.SEQ
2762 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ... CAA
2760 WRKLVDFRELNKRTQ pNL4-3.seq
2760 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ... CAA
3287 WRKLVDFRELNKRTQ pHDMHgpm2.seq
3287 TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG

2807 DFWEVQLGIPHPAGL NL4-3 genbank.SEQ
2807 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
2805 DFWEVQLGIPHPAGL pNL4-3.seq
2805 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA
3332 DFWEVQLGIPHPAGL pHDMHgpm2.seq
3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG

2852 KQKKSVTVLDVGDAY NL4-3 genbank.SEQ
2852 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
2850 KQKKSVTVLDVGDAY pNL4-3.seq
2850 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT
3377 KQKKSVTVLDVGDAY pHDMHgpm2.seq
3377 AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC
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




2897 FSVPLDKDFRKYTAF NL4-3 genbank.SEQ
2897 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT
2895 FSVPLDKDFRKYTAF pNL4-3.seq
2895 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT
3422 FSVPLDKDFRKYTAF pHDMHgpm2.seq
3422 TTC TCC GTG CCC CTG GAC AAG GAC TTC CGC AAG TAC ACC GCC TTC

2942 TIPSINNETPGIRYQ NL4-3 genbank.SEQ
2942 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
2940 TIPSINNETPGIRYQ pNL4-3.seq
2940 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG
3467 TIPSINNETPGIRYQ pHDMHgpm2.seq
3467 ACC ATC CCC TCC ATC AAC AAC GAG ACC CCC GGC ATC CGC TAC CAG

2987 YNVLPQGWKGSPAIF NL4-3 genbank.SEQ
2987 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
2985 YNVLPQGWKGSPAIF pNL4-3.seq
2985 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC
3512 YNVLPQGWKGSPAIF pHDMHgpm2.seq
3512 TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC

3032 QCSMTKILEPFRKQN NL4-3 genbank.SEQ
3032 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3030 QCSMTKILEPFRKQN pNL4-3.seq
3030 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT
3557 QCSMTKILEPFRKQN pHDMHgpm2.seq
3557 CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC

3077 PDIVIYQYMDDLYVG NL4-3 genbank.SEQ
3077 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3075 PDIVIYQYMDDLYVG pNL4-3.seq
3075 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA
3602 PDIVIYQYMDDLYVG pHDMHgpm2.seq
3602 CCC GAC ATC GTG ATC TAC CAG TAC ATG GAC GAC CTG TAC GTG GGC

3122 SDLEIGQHRTKIEEL NL4-3 genbank.SEQ
3122 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3120 SDLEIGQHRTKIEEL pNL4-3.seq
3120 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG
3647 SDLEIGQHRTKIEEL pHDMHgpm2.seq
3647 TCC GAC CTG GAG ATC GGC CAG CAC CGC ACC AAG ATC GAG GAG CTG



Fig. 9D 






3692 RQHLLRWGFTTPDKK pHDMHgpm2.seq
3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG

3212 HQKEPPFLWMGYELH NL4-3 genbank.SEQ
3212 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT ... CTC CAT
3210 HQKEPPFLWMGYELH pNL4-3.seq
3210 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT ... CTC CAT
3737 HQKEPPFLWMGYELH pHDMHgpm2.seq
3737 CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC

3257 PDKWTVQPIVLPEKD NL4-3 genbank.SEQ
3257 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA ... ... ...
3255 PDKWTVQPIVLPEKD pNL4-3.seq
3255 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA ... ... ...
3782 PDKWTVQPIVLPEKD pHDMHgpm2.seq
3782 CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC

3300 SWTVNDIQKLVGKLN pNL4-3.seq
3300 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT
3827 SWTVNDIQKLVGKLN pHDMHgpm2.seq
3827 TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC

3347 WASQIYAGIKVRQLC NL4-3 genbank.SEQ
3347 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG ... TTA TGT
3345 WASQIYAGIKVRQLC pNL4-3.seq
3345 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG ... TTA TGT
3872 WASQIYAGIKVRQLC pHDMHgpm2.seq
3872 TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC

3392 KLLRGTKALTEVVPL NL4-3 genbank.SEQ
3392 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3390 KLLRGTKALTEVVPL pNL4-3.seq
3390 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA
3917 KLLRGTKALTEVVPL pHDMHgpm2.seq
3917 AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG
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
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
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



3707 TPKFKLPIQKETWEA NL4-3 genbank.SEQ
3707 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA
3705 TPKFKLPIQKETWEA pNL4-3.seq
3705 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA
4232 TPKFKLPIQKETWEA pHDMHgpm2.seq
4232 ACT CCC AAG TTC AAG CTG CCC ATC CAG AAG GAG ACC TGG GAG GCC

3752 WWTEYWQATWIPEWE NL4-3 genbank.SEQ
3752 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG
3750 WWTEYWQATWIPEWE pNL4-3.seq
3750 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG
4277 WWTEYWQATWIPEWE pHDMHgpm2.seq
4277 TGG TGG ACC GAG TAC TGG CAG GCC ACC TGG ATC CCC GAG TGG GAG

3797 FVNTPPLVKLWYQLE NL4-3 genbank.SEQ
3797 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG
3795 FVNTPPLVKLWYQLE pNL4-3.seq
3795 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG
4322 FVNTPPLVKLWYQLE pHDMHgpm2.seq
4322 TTC GTG AAC ACC CCC CCC CTG GTG AAG CTG TGG TAC CAG CTG GAG

3842 KEPIIGAETFYVDGA NL4-3 genbank.SEQ
3842 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
3840 KEPIIGAETFYVDGA pNL4-3.seq
3840 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA
4367 KEPIIGAETFYVDGA pHDMHgpm2.seq
4367 AAG GAG CCC ATC ATC GGC GCC GAG ACC TTC TAC GTG GAC GGC GCC

3887 ANRETKLGKAGYVTD NL4-3 genbank.SEQ
3887 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC
3885 ANRETKLGKAGYVTD pNL4-3.seq
3885 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC
4412 ANRETKLGKAGYVTD pHDMHgpm2.seq
4412 GCC AAC CGC GAG ACC AAG CTG GGC AAG GCC GGC TAC GTG ACC GAC

3932 RGRQKVVPLTDTTNQ NL4-3 genbank.SEQ
3932 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG
3930 RGRQKVVPLTDTTNQ pNL4-3.seq
3930 AGA GGA AGA CAA AAA GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG
4457 RGRQKVVPLTDTTNQ pHDMHgpm2.seq
4457 CGC GGC CGC CAG AAG GTG GTG CCC CTG ACC GAC ACC ACC AAC CAG
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
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KTELQAIHLALQDSG 
AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 

KTELQAIHLALQDSG 
AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 

KTELQAIHLALQDSG 
AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC 
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NL4-3 genbank.SEQ 
pNL4-3.seq
pHDMHgpm2.seq

4202 LVSAGIRKVLFLDGI NL4-3 genbank.SEQ
4202 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA
4200 LVSAGIRKVLFLDGI pNL4-3.seq
4200 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA
4727 LVSAGIRKVLFLDGI pHDMHgpm2.seq
4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC
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
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NL4-3 genbank.SEQ 
pNL4-3.seq 
pHDMHgpm2 . seq 
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
4427 THLEGKVILVAVHVA NL4-3 genbank.SEQ
4427 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC
4425 THLEGKVILVAVHVA pNL4-3.seq
4425 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC
4952 THLEGKVILVAVHVA pHDMHgpm2.seq
4952 ACC CAC CTG GAG GGC AAG GTG ATC CTG GTG GCC GTG CAC GTG GCC

4472 SGYIEAEVIPAETGQ NL4-3 genbank.SEQ
4472 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA
4470 SGYIEAEVIPAETGQ pNL4-3.seq
4470 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA
4997 SGYIEAEVIPAETGQ pHDMHgpm2.seq
4997 TCC GGC TAC ATC GAG GCC GAG GTG ATC CCC GCC GAG ACC GGC CAG
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



4517 ETAYFLLKLAGRWPV NL4-3 genbank.SEQ
4517 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA ...
4515 ETAYFLLKLAGRWPV pNL4-3.seq
4515 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA ...
5042 ETAYFLLKLAGRWPV pHDMHgpm2.seq
5042 GAG ACC GCC TAC TTC CTG CTG AAG CTG GCC GGC CGC TGG CCC GTG

4562 KTVHTDNGSNFTSTT NL4-3 genbank.SEQ
4562 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
4560 KTVHTDNGSNFTSTT pNL4-3.seq
4560 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA
5087 KTVHTDNGSNFTSTT pHDMHgpm2.seq
5087 AAG ACC GTG CAC ACC GAC AAC GGC TCC AAC TTC ACC TCC ACC ACC

4607 VKAACWWAGIKQEFG NL4-3 genbank.SEQ
4607 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC
4605 VKAACWWAGIKQEFG pNL4-3.seq
4605 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC
5132 VKAACWWAGIKQEFG pHDMHgpm2.seq
5132 GTG AAG GCC GCC TGC TGG TGG GCC GGC ATC AAG CAG GAG TTC GGC

4652 IPYNPQSQGVIESMN NL4-3 genbank.SEQ
4652 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT
4650 IPYNPQSQGVIESMN pNL4-3.seq
4650 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT
5177 IPYNPQSQGVIESMN pHDMHgpm2.seq
5177 ATC CCC TAC AAC CCC CAG TCC CAG GGC GTG ATC GAG TCC ATG AAC

4697 KELKKIIGQVRDQAE NL4-3 genbank.SEQ
4697 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA
4695 KELKKIIGQVRDQAE pNL4-3.seq
4695 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA
5222 KELKKIIGQVRDQAE pHDMHgpm2.seq
5222 AAG GAG CTG AAG AAG ATC ATC GGC CAA GTC CGC GAC CAG GCC GAG

4742 HLKTAVQMAVFIHNF NL4-3 genbank.SEQ
4742 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT
4740 HLKTAVQMAVFIHNF pNL4-3.seq
4740 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT
5267 HLKTAVQMAVFIHNF pHDMHgpm2.seq
5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC
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



4787 KRKGGIGGYSAGERI NL4-3 genbank.SEQ
4787 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA ... ATA
4785 KRKGGIGGYSAGERI pNL4-3.seq
4785 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA ... ATA
5312 KRKGGIGGYSAGERI pHDMHgpm2.seq
5312 AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC

4832 VDIIATDIQTKELQK NL4-3 genbank.SEQ
4832 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT ... ... TTA ... ...
4830 VDIIATDIQTKELQK pNL4-3.seq
4830 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT ... ... TTA ... ...
5357 VDIIATDIQTKELQK pHDMHgpm2.seq
5357 GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG

4877 QITKIQNFRVYYRDS NL4-3 genbank.SEQ
4877 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
4875 QITKIQNFRVYYRDS pNL4-3.seq
4875 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC
5402 QITKIQNFRVYYRDS pHDMHgpm2.seq
5402 CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC

4922 RDPVWKGPAKLLWKG NL4-3 genbank.SEQ
4922 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG ... GGT
4920 RDPVWKGPAKLLWKG pNL4-3.seq
4920 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG ... GGT
5447 RDPVWKGPAKLLWKG pHDMHgpm2.seq
5447 CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC

4967 EGAVVIQDNSDIKVV NL4-3 genbank.SEQ
4967 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
4965 EGAVVIQDNSDIKVV pNL4-3.seq
4965 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG
5492 EGAVVIQDNSDIKVV pHDMHgpm2.seq
5492 GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG

5012 PRRKAKIIRDYGKQM NL4-3 genbank.SEQ
5012 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5010 PRRKAKIIRDYGKQM pNL4-3.seq
5010 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG
5537 PRRKAKIIRDYGKQM pHDMHgpm2.seq
5537 CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG
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



5057 AGDDCVASRQDED. NL4-3 genbank.SEQ
5057 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5055 AGDDCVASRQDED. pNL4-3.seq
5055 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA
5582 AGDDCVASRQDED. pHDMHgpm2.seq
5582 GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC TAA
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AGCTTGGCCC ATTGCATACG TTGTATCCAT ATCATAATAT GTACATTTAT ATTGGCTCAT 60 
GTCCAACATT ACCGCCATGT TGACATTGAT TATTGACTAG TTATTAATAG TAATCAATTA 120 
CGGGGTCATT AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG 180 

GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC 240 

CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA 300 

CTGCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA 360

ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA 420 

CTTGGCAGTA CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT 480 

ACATCAATGG GCGTGGATAG CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG 540 

ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA 600 

ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC TATATAAGCA 660 

GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC 720 

ATAGAAGACA CCGGGACCGA TCCAGCCTCC CCTCGAAGCT GATCCTGAGA ACTTCAGGGT 780 

GAGTCTATGG GACCCTTGAT GTTTTCTTTC CCCTTCTTTT CTATGGTTAA GTTCATGTCA 840 

TAGGAAGGGG AGAAGTAACA GGGTACACAT ATTGACCAAA TCAGGGTAAT TTTGCATTTG 900 

TAATTTTAAA AAATGCTTTC TTCTTTTAAT ATACTTTTTT GTTTATCTTA TTTCTAATAC 960 

TTTCCCTAAT CTCTTTCTTT CAGGGCAATA ATGATACAAT GTATCATGCC TCTTTGCACC 1020

ATTCTAAAGA ATAACAGTGA TAATTTCTGG GTTAAGGCAA TAGCAATATT TCTGCATATA 1080 

AATATTTCTG CATATAAATT GTAACTGATG TAAGAGGTTT CATATTGCTA ATAGCAGCTA 1140 

CAATCCAGCT ACCATTCTGC TTTTATTTTA TGGTTGGGAT AAGGCTGGAT TATTCTGAGT 1200 

CCAAGCTAGG CCCTTTTGCT AATCATGTTC ATACCTCTTA TCTTCCTCCC ACAGCTCCTG 1260 

GGCAACGTGC TGGTCTGTGT GCTGGCCCAT CACTTTGGCA AAGAATTCTA GACTGCCATG 1320

GGCGCCCGCG CCTCCGTGCT GTCCGGCGGC GAGCTGGACA AGTGGGAGAA GATCCGCCTG 1380

CGCCCCGGCG GCAAGAAGCA GTACAAGCTG AAGCACATCG TGTGGGCCTC CCGCGAGCTG 1440 

GAGCGCTTCG CCGTGAACCC CGGCCTGCTG GAGACCTCCG AGGGCTGCCG CCAGATCCTG 1500 

GGCCAGCTGC AGCCCTCCCT GCAAACCGGC TCCGAGGAGC TGCGCTCCCT GTACAACACC 1560 

ATCGCCGTGC TGTACTGCGT GCACCAGCGC ATCGACGTGA AGGACACCAA GGAGGCCCTG 1620

GACAAGATCG AGGAGGAGCA GAACAAGTCC AAGAAGAAGG CCCAGCAGGC CGCCGCCGAC 1680 

ACCGGCAACA ACTCCCAGGT GTCCCAGAAC TACCCCATCG TGCAGAACCT GCAGGGCCAG 1740 

ATGGTGCACC AGGCCATCTC CCCCCGCACC CTGAACGCCT GGGTGAAGGT GGTGGAGGAG 1800 

AAGGCCTTCT CCCCCGAAGT CATCCCCATG TTCTCCGCCC TGTCCGAGGG CGCCACCCCC 1860 

CAGGACCTGA ACACCATGCT GAACACCGTG GGCGGCCACC AGGCCGCCAT GCAGATGCTG 1920 

AAGGAGACCA TCAACGAGGA GGCCGCCGAG TGGGACCGCC TGCACCCCGT GCACGCCGGC 1980 

CCCATCGCCC CCGGCCAGAT GCGCGAGCCC CGCGGCTCCG ACATCGCCGG CACCACCTCC 2040 

ACCCTGCAAG AGCAGATCGG CTGGATGACC CACAACCCCC CCATCCCCGT GGGCGAGATC 2100 

TACAAGCGCT GGATCATCCT GGGCCTGAAC AAGATCGTGC GCATGTACTC CCCCACCTCC 2160 

ATCCTGGACA TCCGCCAGGG CCCCAAGGAG CCCTTCCGCG ACTACGTGGA CCGCTTCTAC 2220 

AAGACCCTGC GCGCCGAGCA GGCCTCCCAG GAGGTAAAGA ACTGGATGAC CGAGACCCTG 2280 

CTGGTGCAGA ACGCCAACCC CGACTGCAAG ACCATCCTGA AGGCCCTGGG CCCCGGCGCC 2340 

ACCCTGGAGG AGATGATGAC CGCCTGCCAG GGCGTGGGCG GCCCCGGCCA CAAGGCCCGC 2400 

GTGCTGGCCG AGGCCATGTC CCAAGTCACC AACCCCGCCA CCATCATGAT CCAGAAGGGC 2460 

AACTTCCGCA ACCAGCGCAA GACCGTGAAG TGCTTCAACT GCGGCAAGGA GGGCCACATC 2520 

GCCAAGAACT GCCGCGCCCC CCGCAAGAAG GGCTGCTGGA AGTGCGGCAA GGAGGGCCAC 2580 

CAGATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG GGAAGATCTG GCCTTCCCAC 2640 

AAGGGAAGGC CAGGGAATTT TCTTCAGAGC AGACCAGAGC CAACAGCCCC ACCAGAAGAG 2700 

AGCTTCAGGT TTGGGGAAGA GACAACAACT CCCTCTCAGA AGCAGGAGCC GATAGACAAG 2760

GAACTGTATC CTTTAGCTTC CCTCAGATCA CTCTTTGGCA GCGACCCCTC GTCACAATAA 2820
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AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880

AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940 

TCAAAGTCCG CCAGTACGAC CAGATCCTGA TCGAGATCTG CGGCCACAAG GCCATCGGCA 3000

CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060 

GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120 

GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180 

TGGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240

CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300 

TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360 

CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420

ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480 

TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540 

GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600

ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660 

TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720

CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780 

ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840 

ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900 

AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960

TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020 

ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080 

GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140 

CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200 

TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260 

AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320

AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380 

TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440 

CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAG ACCACCAACC 4500 

AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560 

TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620 

CCGAGCTGGT GTCCCAGATC ATCGAGCAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680

GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740 

GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800 

ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860 

AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920 

ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980 

TGGCCGTGCA CGTGGCCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040 

AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100 

CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160 

TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220 

ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280 

CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340 

CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CCAGACCAAG GAGCTGCAGA 5400 

AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460 

GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520

CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATGAT CCGCGACTAC GGCAAGCAGA 5580 

TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAAGATTA 5640 
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GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700 

CCCAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760 

CTAATGCCCT GGCCCACAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820 

AGGTTCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 5880 

CATCTGGATT CTGCCTAATA AAAAACATTT ATTTTCATTG CAATGATGTA TTTAAATTAT 5940 

TTCTGAATAT TTTACTAAAA AGGGAATGTG GGAGGTCAGT GCATTTAAAA CATAAAGAAA 6000 

TGAAGAGCTA GTTCAAACCT TGGGAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060

GAGGCTGCAA ACAGCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120 

CCTCAGAAAA GGATTCAAGT AGAGGCTTGA TTTGGAGGTT AAAGTTTTGC TATGCTGTAT 6180 

TTTACATTAC TTATTGTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240 

TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300 

CTGAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360 

TTAGTTGTCT CTGTTGTCTT ATAGAGGTCT ACTTGAAGAA GGAAAAACAG GGGGCATGGT 6420 

TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCCACT CACAGTGACC CGGAATCCCT 6480 

CGACATGGCA GTCTAGATCA TTCTTGAAGA CGAAAGGGCC TCGTGATACG CCTATTTTTA 6540 

TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT 6600 

GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG 6660 

AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA 6720 

CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780 

CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840 

ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900 

CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960 

GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020 

CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 7080 

ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG 7140 

GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200 

CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG 7260 

GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320 

TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380 

GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 7440 

GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 7500 

CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 7560 

CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 7620 

TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 7680 

TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT 7740 

TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800 

GCGGTGGTTT GTTTGCCGGA TGAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860 

AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 7920 

AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980 

GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 8040 

GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 8100 

TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG 8160

AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 8220 

CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 8280 

GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 8340 

GGATGCGCCG CGTGCGGCTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400 

TGGTTTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATTCTT GGAGTGGTGA 8460 
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ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520 

GCGACGCAAC GCGGGGAGGC AGACAAGGTA TAGGGCGGCG CCTACAATCC ATGCCAACCC 8580 

GTTCCATGTG CTCGCCGAGG CGGCATAAAT CCCCGTGACG ATCAGCGGTC CAATGATCGA 8640 

AGTTAGGCTG GTAAGAGCCG CGAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700 

CTGCCTGGAC AGCATGGCCT GCAACGCGGG CATCCCGATG CCGCCGGAAG CGAGAAGAAT 8760 

CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCTAGGCCTC 8820 

CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCAGAGGC CGAGGCGGCC TCGGCCTCTG 8880 
CATAAATAAA AAAAATTAGT CAGCCATG 8908 
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